Review of "Data Mining: Practical Machine Learning Tools and Techniques" by Witten and Frank
Abstract
In the early 1990s some sectors of the computer science community were developing the idea of data understanding as a discovery-driven, systematic and iterative process. This "data mining" research and development area was expected to take advantage of the expansion and consolidation of machine learning methodologies together with the integration of traditional statistical analysis and database management strategies. The main goal was to identify relevant, interesting and potentially novel informational patterns and relationships in large data sets to support decision making and knowledge discovery. In the mid-1990s developers and users of decision-making support systems in areas such as finance (e.g. credit approval and fraud detection applications) and marketing and sales analysis (e.g. shopping patterns and sales prediction) were showing a great deal of enthusiasm about the business value of data mining applications. During the next few years international conferences, journals and books were more frequently reporting advances, tools and applications in other areas such as biomedical informatics, engineering, physics, law enforcement and agriculture. Today data mining is seen as a discipline or paradigm that actively aids in the development of these and other scientific areas (e.g. Web-based computing and systems biology). Data mining has become a fundamental research topic in the progression of computing applications in health care and biomedicine. Advances in data mining have applications and implications in areas ranging from information management in healthcare organisations, consumer health informatics, public health and epidemiology, patient care and monitoring systems and large-scale image analysis to information extraction and classification of scientific literature [1].
Approaches, techniques and applications associated with data mining have also significantly supported different data understanding and decision support tasks in bio-signal processing, such as the classification, visualisation and identification of complex relationships between diagnostic variables or groups of patients [2,3].
Similar resources
Data Mining using Learning Classifier Systems
[118] Stewart W. Wilson. Compact Rulesets from XCSI. In Lanzi et al. [73], pages 196–208. [119] Ian H. Witten and Eibe Frank. Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations. Morgan Kaufmann, 2000. [120] G. Zweiger. Knowledge discovery in gene-expression-microarray data: mining the information output of the genome. Trends in Biotechnology, 17:429...
Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations
Witten and Frank's textbook was one of two books that I used for a data mining class in the Fall of 2001. The book covers all major methods of data mining that produce a knowledge representation as output. Knowledge representation is hereby understood as a representation that can be studied, understood, and interpreted by human beings, at least in principle. Thus, neural networks and genetic al...
Weka: Practical Machine Learning Tools and Techniques with Java Implementations
The Waikato Environment for Knowledge Analysis (Weka) is a comprehensive suite of Java class libraries that implement many state-of-the-art machine learning and data mining algorithms. Weka is freely available on the World-Wide Web and accompanies a new text on data mining [1] which documents and fully explains all the algorithms it contains. Applications written using the Weka class libraries ...
A Machine Learning Workbench for Data Mining
The Weka workbench is an organized collection of state-of-the-art machine learning algorithms and data preprocessing tools. The basic way of interacting with these methods is by invoking them from the command line. However, convenient interactive graphical user interfaces are provided for data exploration, for setting up large-scale experiments on distributed computing platforms, and for design...
Creating chemo- & bioinformatics workflows, further developments within the CDK-Taverna Project
References 1. Oinn T, Addis M, Ferris M, Marvin D, Senger M, Greenwood M, Carver T, Glover K, Pocock M, Wipat A, Li P: Taverna: A tool for the composition and enactment of bioinformatics workflows. Bioinformatics 2004, 20(17):3045-3054. 2. Steinbeck C, Han YQ, Kuhn S, Horlacher O, Luttmann E, Willighagen E: The Chemistry Development Kit (CDK): An open-source Java library for chemo- and bioinforma...
Journal title: BioMedical Engineering OnLine
Volume: 5
Publication date: 2006